PLOS Digital Health

88 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
Can AI Match Human Experts? Evaluating LLM-Generated Feedback on Resident Scholarly Projects
2026-03-04 medical education 10.64898/2026.03.04.26346878
Top 0.4% (13.0%)

Background: Delivering timely, high-quality feedback on resident scholarly projects is labour-intensive, especially in large programmes. We developed an AI-assisted evaluation system, powered by the open-weight LLaMA-3.1 large language model (LLM), to generate formative feedback on Family Medicine residents' scholarly projects and compared its performance with expert human evaluators. Methods: We evaluated whether the AI-generated feedback achieves comparable quality to expert feedback. The tool ing...

2
Red-Teaming Medical AI: Systematic Adversarial Evaluation of LLM Safety Guardrails in Clinical Contexts
2026-03-05 health informatics 10.64898/2026.02.26.26347212
Top 0.5% (11.9%)

Background: Large language models (LLMs) are increasingly deployed in medical contexts as patient-facing assistants, providing medication information, symptom triage, and health guidance. Understanding their robustness to adversarial inputs is critical for patient safety, as even a single safety failure can lead to adverse outcomes including severe harm or death. Objective: To systematically evaluate the safety guardrails of state-of-the-art LLMs through adversarial red-teaming specifically designe...

3
Thyroid Cancer Risk Prediction from Multimodal Datasets Using Large Language Model
2026-03-06 health informatics 10.64898/2026.03.05.26347766
Top 0.9% (10.7%)

Thyroid carcinoma is one of the most prevalent endocrine malignancies worldwide, and accurate preoperative differentiation between benign and malignant thyroid nodules remains clinically challenging. Diagnostic methods that medical practitioners use at present depend on their personal judgment to evaluate both imaging results and separate clinical tests, which creates inconsistency that leads to incorrect medical evaluations. The combination of radiological imaging with clinical information syst...

4
Preparing for the Future: A Mixed Methods Study Protocol on AI Awareness and Educational Integration in Qatar's Primary Health Care Workforce
2026-03-07 health systems and quality improvement 10.64898/2026.03.06.26347773
Top 1% (9.7%)

Background: Artificial intelligence (AI) is increasingly being integrated into healthcare systems, with growing applications in clinical decision support, workflow optimization, and population health management. While substantial investments have been made in digital infrastructure, the successful adoption of AI in primary care depends critically on the readiness, awareness, and educational preparedness of healthcare professionals. Global health authorities emphasize the need for ethically ground...

5
Medical concept understanding in large language models is fragmented
2026-03-05 health informatics 10.64898/2026.03.03.26347552
Top 1% (9.0%)

Large language models (LLMs) perform strongly across a wide range of medical applications, yet it remains unclear whether such success reflects genuine understanding of medical concepts. We present an ontology-grounded, concept-centered evaluation of medical concept understanding in LLMs. Using 6,252 phenotype concepts from Human Phenotype Ontology, we decompose concept understanding into three core dimensions--concept identity, concept hierarchy, and concept meaning--and design corresponding be...

6
Perceptions of Artificial Intelligence in the Editorial and Peer Review Process: A Cross-Sectional Survey of Traditional, Complementary, and Integrative Medicine Journal Editors
2026-03-04 health informatics 10.64898/2026.03.04.26347571
Top 1% (8.8%)

Background: Artificial intelligence chatbots (AICs) are increasingly being integrated into scholarly publishing, with the potential to automate routine editorial tasks and streamline workflows. In traditional, complementary, and integrative medicine (TCIM) publishing, editorial and peer review processes can be particularly complex due to diverse methodologies and culturally embedded knowledge systems, presenting unique opportunities and challenges for AIC adoption. Methods: An anonymous, online cro...

7
Large language models for self-administered conversational vignette assessment of provider competencies: A pilot and validation study in Vietnam with automated LLM-powered transcript classification
2026-03-04 health economics 10.64898/2026.03.02.26347479
Top 2% (8.2%)

We developed and validated a self-administered clinical vignette platform powered by a large language model (LLM), deployed through a SurveyCTO web survey, to measure primary health care provider competencies in Vietnam. In a pilot focus group, nine physicians rated LLM-simulated patient interactions as realistic (mean 3.78/5) and user-friendly. In the validation phase, 22 providers completed 132 vignette interactions across ten clinical scenarios in Vietnamese. Essential diagnostic checklist sc...

8
Enhancing Prediabetes Diagnosis from Continuous Glucose Monitoring Data via Iterative Label Cleaning and Deep Learning
2026-03-05 health informatics 10.64898/2026.03.04.26347604
Top 2% (8.0%)

As of early 2026, over 115 million US adults (more than 1 in 3) have prediabetes, a condition with an annual conversion rate of 5%-10% to type 2 diabetes. Total diabetes (diagnosed and undiagnosed) affects approximately 40.1 million Americans, or 12% of the population, with roughly 1.5 million new cases diagnosed annually. Continuous Glucose Monitoring (CGM) provides real-time, 24/7 insights into glycemic variability, detecting dangerous highs, lows, and trends that HbA1c (a 3-month average) mis...

9
Evaluating a Locally Deployed 20-Billion Parameter Large Language Model for Automated Abstract Screening in Systematic Reviews
2026-03-04 health informatics 10.64898/2026.03.04.26347506
Top 2% (7.8%)

Background: Systematic reviews (SRs) are essential for evidence-based medicine but require extensive time and resources for abstract screening. Large language models (LLMs) offer potential for automating this process, yet concerns about data privacy, intellectual property protection, and reproducibility limit the use of cloud-based solutions in research settings. Objective: To evaluate the performance of a locally deployed 20-billion parameter LLM for automated abstract screening in systematic revi...

10
Personalized Insights Derived from Wearable Device Data and Large Language Models to Improve Well-Being
2026-03-04 health informatics 10.64898/2026.03.03.26347299
Top 2% (7.8%)

Health behaviors such as physical activity and sleep affect mental health, but the effect of each health behavior varies substantially across individuals, limiting the usefulness of generic behavioral recommendations. We collected one year of continuous wearable and ecological momentary assessment data from 3,139 participants in the Intern Health Study (2018-2023), and examined individual-level associations between wearable-derived features and mood across the internship year. The behaviors asso...

11
Class imbalance correction in artificial intelligence models leads to miscalibrated clinical predictions: a real-world evaluation
2026-03-05 health informatics 10.64898/2026.03.04.26347634
Top 2% (7.6%)

Background: Predictive models employing machine learning algorithms are increasingly being used in clinical decision making, and improperly calibrated models can result in systematic harm. We sought to investigate the impact of class imbalance correction, a commonly applied preprocessing step in machine learning model development, on calibration and modelled clinical decision making in a large real-world context. Methods: A histogram boosted gradient classifier was trained on a highly imbalanced na...

12
Population differences in wearable device wear time: Rescuing data to address biases and advance health equity
2026-03-06 health informatics 10.64898/2026.03.06.26347799
Top 3% (6.8%)

Wearable devices present transformative opportunities for personalized healthcare through continuous monitoring of digital biomarkers; however, individual variations in device wear time could mask or otherwise impact signal identification. Despite the widespread adoption of wearable devices in research, no comprehensive framework exists for understanding how wear time varies across populations or for addressing wear time-related biases in analysis. Using Fitbit data from 11,901 participants in t...

13
Digital monitoring and action planning to reach zero-dose and under-immunised children: Leveraging data for targeted immunisation responses
2026-03-07 health systems and quality improvement 10.64898/2026.03.03.26346932
Top 3% (6.8%)

Background: Persistent inequities in immunisation coverage, particularly among zero-dose and under-immunised children, continue to challenge Pakistan's Expanded Programme on Immunization. Weak feedback loops, inconsistent data quality, and limited real-time monitoring impede effective decision-making. This Implementation Research was conducted under the MAINSTREAM Initiative funded by Alliance for Health Policy and Systems Research (AHPSR) and supported by the Aga Khan Community Health Services De...

14
A Qualitative Study of Patient and Healthcare Provider Perspectives on Mobile Health Assessments for Cervical Spondylotic Myelopathy
2026-03-05 health informatics 10.64898/2026.03.04.26347622
Top 3% (6.6%)

Objective: Evaluating and monitoring patients with cervical spondylotic myelopathy (CSM) remains a challenge due to limited tools for assessing objective neurological disability longitudinally and in the home environment. Given their prevalence and low cost, mobile health (mHealth), and specifically smartphone technologies offer a promising approach to fill this gap. This study explored stakeholder perspectives on the role of mHealth in CSM monitoring to inform development of a smartphone-based ...

15
Intelligent Guidance and Diagnostic Assistance for Handheld Ultrasound: Actor-Critic Based Approach for Carotid Artery and Thyroid Examination
2026-03-04 radiology and imaging 10.64898/2026.03.02.26347395
Top 3% (6.5%)

Handheld ultrasound devices have revolutionized point-of-care diagnostics, but their effectiveness remains limited by operator dependency and the need for specialized training. This paper presents an intelligent guidance and diagnostic assistance system for the handheld wireless ultrasound device, enabling automated carotid artery and thyroid examinations through handheld operation. Drawing inspiration from the Actor-Critic framework, we implement a simulation-based reinforcement learning approa...

16
Trustworthy personalized treatment selection: causal effect-trees and calibration in perioperative medicine
2026-03-04 health informatics 10.64898/2026.03.03.26347440
Top 3% (6.5%)

Background: Personalized medicine promises to tailor treatments to the individual, but it carries a hidden risk: mistaking statistical noise for actionable clinical insight. Current machine learning approaches often provide predictions, but fail to inform clinicians when those predictions are unreliable. Objective: Develop a deployment-readiness framework that integrates causal inference, interpretable effect-trees, and calibration assessment to distinguish actionable signal from unreliable variati...

17
Walking in the Free World: Establishing Normative Trajectories for Ecological Assessment of Robust Gait Variability with Age
2026-03-06 geriatric medicine 10.64898/2026.03.06.26347806
Top 3% (6.3%)

Gait variability is a critical functional indicator of dynamic balance and neurocognitive decline in health. Its translation into clinical practice is, however, challenged by a lack of age-related normative trajectories and reference values under real-world ecological settings. Furthermore, the conventional metrics used to estimate gait variability (Coefficient of Variation, CV; Standard Deviation, SD) have a fundamental methodological flaw: the inherent sensitivity of conventional metrics to th...

18
Show Your Work: Verbatim Evidence Requirements and Automated Assessment for Large Language Models in Biomedical Text Processing
2026-03-04 health informatics 10.64898/2026.03.03.26346690
Top 4% (6.0%)

Purpose: Large language models (LLMs) are used for biomedical text processing, but individual decisions are often hard to audit. We evaluated whether enforcing a mechanically checkable "show your work" quote affects accuracy, stability, and verifiability for trial eligibility-scope classification from abstracts. Methods: We used 200 oncology randomized controlled trials (2005-2023) and provided models with only the title and abstract. Trials were labeled with whether they allowed for the inclusio...

19
Variability in Automated Sepsis Case Detection: A Systematic Analysis of Implementation Methods in Clinical Data Repositories
2026-03-04 health informatics 10.64898/2026.02.27.26347259
Top 5% (5.5%)

Objective: To systematically identify and characterize methodological heterogeneity in sepsis case detection methods using the MIMIC-III database or the eICU-CRD, and to quantify the resulting variability in sepsis detection rates. Materials and Methods: We conducted a PRISMA-guided systematic review of PubMed and Web of Science (2016-2024), and stratified studies by cohort definition to obtain comparable subsets. We extracted information on sepsis case detection methodology across six domains: par...

20
Using the ECHILD Database to Explore Educational and Health Outcomes of Unaccompanied Asylum-Seeking Children living in England (2005 to 2021)
2026-03-04 health informatics 10.64898/2026.03.04.26347576
Top 6% (4.5%)

UK-based quantitative research on the health and education outcomes of Unaccompanied Asylum-Seeking Children (UASC) remains limited, especially at national level. Linked administrative data provide an unprecedented opportunity to study these outcomes among UASC. This paper lays a foundation for further research, particularly examining the influence of socio-demographic, legal and environmental factors on UASC's health and educational outcomes. We described the UASC population with a first record...